The Finite Ridgelet Transform 
for Image Representation 

Minh N. Do, Member, IEEE, and Martin Vetterli, Fellow, IEEE 



Abstract — The ridgelet transform (Candes and Donoho, 
1999) was introduced as a sparse expansion for functions on 
continuous spaces that are smooth away from discontinu- 
ities along lines. In this paper, we propose an orthonormal 
version of the ridgelet transform for discrete and finite-size 
images. Our construction uses the finite Radon transform 
(FRAT) (Bolker, 1987; Matúš and Flusser, 1993) as a building
block. To overcome the periodization effect of a finite 
transform, we introduce a novel ordering of the FRAT co- 
efficients. Taking the one-dimensional wavelet transform 
on the projections of the FRAT in a special way results 
in the finite ridgelet transform (FRIT), which is invertible, 
non-redundant, and computed via fast algorithms. Furthermore,
our construction leads to a family of directional and 
orthonormal bases for images. Numerical results show that 
the FRIT is more effective than the wavelet transform in 
approximating and denoising images with straight edges. 

Keywords — wavelets, ridgelets, Radon transform, direc- 
tional bases, discrete transforms, non-linear approximation, 
image representation, image denoising. 

I. Introduction 

Many image processing tasks take advantage of sparse 
representations of image data where most information 
is packed into a small number of samples. Typically, 
these representations are achieved via invertible and non- 
redundant transforms. Currently, the most popular choices 
for this purpose are the wavelet transform [1], [2], [3] and 
the discrete cosine transform [4]. 

The success of wavelets is mainly due to the good per- 
formance for piecewise smooth functions in one dimension. 
Unfortunately, such is not the case in two dimensions. In 
essence, wavelets are good at catching zero-dimensional or 
point singularities, but two-dimensional piecewise smooth 
signals resembling images have one-dimensional singulari- 
ties. That is, smooth regions are separated by edges, and 
while edges are discontinuous across, they are typically 
smooth curves. Intuitively, wavelets in two dimensions are 
obtained by a tensor-product of one dimensional wavelets 
and they are thus good at isolating the discontinuity across 
an edge, but will not see the smoothness along the edge. 

To overcome the weakness of wavelets in higher dimensions,
Candès and Donoho [5], [6] recently pioneered a new
system of representations named ridgelets, which deal effectively
with line singularities in 2-D. The idea is to map 
a line singularity into a point singularity using the Radon 
transform [7]. Then, the wavelet transform can be used to 
effectively handle the point singularity in the Radon do- 
main. Their initial proposal was intended for functions 
defined in the continuous $\mathbb{R}^2$ space. 

For practical applications, the development of discrete 
versions of the ridgelet transform that lead to algorithmic 
implementations is a challenging problem. Due to the ra- 
dial nature of ridgelets, straightforward implementations 
based on discretization of continuous formulae would require
interpolation in polar coordinates, and thus result in
transforms that are either redundant or lack perfect
reconstruction. 

In [8], [9], [10], the authors take the redundant approach 
in defining discrete Radon transforms that can lead to in- 
vertible discrete ridgelet transforms with some appealing 
properties. For example, a recent preprint [10] proposes a 
new notion of Radon transform for data in a rectangular 
coordinate such that the lines exhibit geometrical faithfulness.
Their transform is invertible with oversampling by a factor of four.
However, the inverse transform is ill-conditioned 
in the presence of noise and requires an iterative approxi- 
mation algorithm. 

In this paper, we propose a discrete ridgelet transform 
that achieves both invertibility and non-redundancy. In 
fact, our construction leads to a large family of orthonormal 
and directional bases for digital images, including adaptive 
schemes. As a result, the inverse transform is numerically 
stable and uses the same algorithm as the forward trans- 
form. Because a basic building block in our construction is 
the finite Radon transform [11], which has a wrap-around 
(or aliased line) effect, our ridgelet transform is not geo- 
metrically faithful. The properties of the new transform 
are demonstrated and studied in several applications. 

As an illustration, consider the image denoising prob- 
lem where there exist other approaches that explore the 
geometrical regularity of edges, for example by chaining 
adjacent wavelet coefficients and then thresholding them 
over those contours [12]. However, the discrete ridgelet 
transform approach, with its "built-in" linear geometrical 
structure, provides a more direct way of denoising images
with straight edges: by simply thresholding significant
ridgelet coefficients. 

The outline of this paper is as follows. In the next sec- 
tion we review the concept and motivation of ridgelets in 
the continuous domain. In Section III, we introduce the 
finite Radon transform with a novel ordering of coefficients 
as a key step in our discrete ridgelet construction. The 
finite Radon transform is then studied within the theory
of frames. The finite ridgelet transform is defined in Section IV,
where the main result is a general family of orthonormal
transforms for digital images. In Section V, 
we propose several variations on the initial design of the 
finite ridgelet transform. Numerical experiments are pre- 
sented in Section VI, where the new transform is compared 
with the traditional ones, especially the wavelet transform. 
We conclude in Section VII with some discussions and an 
outlook. 

II. Continuous Ridgelet Transform 

We start by briefly reviewing the ridgelet transform and 
showing its connections with other transforms in the con- 
tinuous domain. Given an integrable bivariate function 
$f(x)$, its continuous ridgelet transform (CRT) in $\mathbb{R}^2$ is defined by [5], [6]

$$CRT_f(a, b, \theta) = \int_{\mathbb{R}^2} \psi_{a,b,\theta}(x)\, f(x)\, dx, \tag{1}$$

where the ridgelets $\psi_{a,b,\theta}(x)$ in 2-D are defined from a wavelet-type function in 1-D $\psi(x)$ as

$$\psi_{a,b,\theta}(x) = a^{-1/2}\, \psi\big((x_1 \cos\theta + x_2 \sin\theta - b)/a\big). \tag{2}$$

Figure 1 shows an example ridgelet function, which is oriented at an angle $\theta$ and is constant along the lines $x_1 \cos\theta + x_2 \sin\theta = \text{const}$.




Fig. 1. An example ridgelet function $\psi_{a,b,\theta}(x_1, x_2)$.

For comparison, the (separable) continuous wavelet 
transform (CWT) in $\mathbb{R}^2$ of $f(x)$ can be written as

$$CWT_f(a_1, a_2, b_1, b_2) = \int_{\mathbb{R}^2} \psi_{a_1,a_2,b_1,b_2}(x)\, f(x)\, dx, \tag{3}$$

where the wavelets in 2-D are tensor products

$$\psi_{a_1,a_2,b_1,b_2}(x) = \psi_{a_1,b_1}(x_1)\, \psi_{a_2,b_2}(x_2), \tag{4}$$

of 1-D wavelets, $\psi_{a,b}(t) = a^{-1/2}\, \psi((t - b)/a)$. 1

As can be seen, the CRT is similar to the 2-D CWT except that the point parameters $(b_1, b_2)$ are replaced by the line parameters $(b, \theta)$. In other words, these 2-D multiscale transforms are related by:

Wavelets: $\to \psi_{\text{scale},\ \text{point-position}}$,

Ridgelets: $\to \psi_{\text{scale},\ \text{line-position}}$.

1 In practice, however, one typically enforces the same dilation scale in both directions, thus leading to three wavelets corresponding to horizontal, vertical and diagonal directions.

As a consequence, wavelets are very effective in rep- 
resenting objects with isolated point singularities, while 
ridgelets are very effective in representing objects with sin- 
gularities along lines. In fact, one can think of ridgelets as 
a way of concatenating 1-D wavelets along lines. Hence the 
motivation for using ridgelets in image processing tasks is 
appealing since singularities are often joined together along 
edges or contours in images. 

In 2-D, points and lines are related via the Radon trans- 
form, thus the wavelet and ridgelet transforms are linked 
via the Radon transform. More precisely, denote the Radon 
transform as 

$$R_f(\theta, t) = \int_{\mathbb{R}^2} f(x)\, \delta(x_1 \cos\theta + x_2 \sin\theta - t)\, dx, \tag{5}$$

then the ridgelet transform is the application of a 1-D wavelet transform to the slices (also referred to as projections) of the Radon transform,

$$CRT_f(a, b, \theta) = \int_{\mathbb{R}} \psi_{a,b}(t)\, R_f(\theta, t)\, dt. \tag{6}$$

It is instructive to note that if, instead of taking a 1-D wavelet transform in (6), one applies a 1-D Fourier transform along $t$, the result is the 2-D Fourier transform. More specifically, let $F_f(\omega)$ be the 2-D Fourier transform of $f(x)$, then we have

$$F_f(\xi \cos\theta, \xi \sin\theta) = \int_{\mathbb{R}} e^{-j \xi t}\, R_f(\theta, t)\, dt. \tag{7}$$

This is the famous projection-slice theorem and is commonly used in image reconstruction from projection methods [13], [14]. The relations between the various transforms are depicted in Figure 2.




Fig. 2. Relations between transforms. The ridgelet transform is the 
application of 1-D wavelet transform to the slices of the Radon 
transform, while the 2-D Fourier transform is the application of 
1-D Fourier transform to those Radon slices. 
III. Finite Radon Transform 

A. Forward and Inverse Transforms 

As suggested in the previous section, a discrete ridgelet 
transform can be constructed using a discrete Radon trans- 
form. Numerous discretizations of the Radon transforms 
have been devised to approximate the continuous formu- 
lae [15], [13], [14], [16], [17], [18]. However, most of them 
were not designed to be invertible transforms for digital 
images. Alternatively, the theory of the finite Radon transform
(that is, a transform for finite-length signals) [11], [19],
[20], [21], which originated from combinatorics, provides an interesting
solution. Also, in [22], a closely related transform 
is derived from the periodization of the continuous Radon 
transform. 

The finite Radon transform (FRAT) is defined as summations of image pixels over a certain set of "lines". Those lines are defined in a finite geometry in a similar way as the lines for the continuous Radon transform in the Euclidean geometry. Denote $Z_p = \{0, 1, \ldots, p-1\}$, where $p$ is a prime number. Note that $Z_p$ is a finite field with modulo $p$ operations [23]. For later convenience, we denote $Z_p^* = \{0, 1, \ldots, p\}$.

The FRAT of a real function $f$ on the finite grid $Z_p^2$ is defined as

$$r_k[l] = FRAT_f(k, l) = \frac{1}{\sqrt{p}} \sum_{(i,j) \in L_{k,l}} f[i, j]. \tag{8}$$

Here $L_{k,l}$ denotes the set of points that make up a line on the lattice $Z_p^2$, or more precisely

$$L_{k,l} = \{(i, j) : j = ki + l \pmod p,\ i \in Z_p\}, \quad 0 \le k < p,$$
$$L_{p,l} = \{(l, j) : j \in Z_p\}. \tag{9}$$

Figure 3 shows an example of the finite lines $L_{k,l}$ where points in the grid $Z_p^2$ are represented by image pixels. Note that due to the modulo operations in the definition of lines for the FRAT, these lines exhibit a "wrap around" effect. In other words, the FRAT treats the input image as one period of a periodic image. Later, we will present several ways to limit this artifact.

Fig. 3. Lines for the $7 \times 7$ FRAT. Parallel lines are grouped in each of the eight possible directions. Images in order from top to bottom, left to right correspond to the values of $k$ from 0 to 7. In each image, points (or pixels) in different lines are assigned different gray-scales.

We observe that in the FRAT domain, the energy is best compacted if the mean is subtracted from the image $f[i, j]$ prior to taking the transform given in (8), which is assumed in the sequel. We also introduce the factor $p^{-1/2}$ in order to normalize the $l_2$-norm between the input and output of the FRAT.

Just as in the Euclidean geometry, a line $L_{k,l}$ on the affine plane $Z_p^2$ is uniquely represented by its slope or direction $k \in Z_p^*$ ($k = p$ corresponds to infinite slope or vertical lines) and its intercept $l \in Z_p$. One can verify that there are $p^2 + p$ lines defined in this way and every line contains $p$ points. Moreover, any two distinct points on $Z_p^2$ belong to just one line. Also, two lines of different slopes intersect at exactly one point. For any given slope, there are $p$ parallel lines that provide a complete cover of the plane $Z_p^2$. This means that for an input image $f[i, j]$ with zero mean, we have

$$\sum_{l=0}^{p-1} r_k[l] = \frac{1}{\sqrt{p}} \sum_{(i,j) \in Z_p^2} f[i, j] = 0, \quad \forall k \in Z_p^*. \tag{10}$$


Thus, (10) explicitly reveals the redundancy of the FRAT: in each direction, there are only $p - 1$ independent FRAT coefficients. Those coefficients at $p + 1$ directions together with the mean value make up a total of $(p + 1)(p - 1) + 1 = p^2$ independent coefficients (or degrees of freedom) in the finite Radon domain, as expected.
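
As a minimal sketch of definition (8) and the redundancy expressed by (10), the following Python fragment computes the FRAT of a small test image by direct summation over the lines in (9). The function name `frat`, the test image and the choice $p = 5$ are illustrative assumptions, not part of the original construction; the direct summation is the plain definition, not the faster histogrammer algorithm of [20].

```python
import math

def frat(f, p):
    """Finite Radon transform (8) of a p-by-p image f[i][j], p prime.

    Returns r[k][l] for k = 0..p, where k = p is the vertical direction.
    """
    scale = 1.0 / math.sqrt(p)
    r = [[0.0] * p for _ in range(p + 1)]
    for k in range(p):
        for l in range(p):
            # Line L_{k,l}: j = k*i + l (mod p), one point per i.
            r[k][l] = scale * sum(f[i][(k * i + l) % p] for i in range(p))
    for l in range(p):
        # Line L_{p,l}: i = l, all j.
        r[p][l] = scale * sum(f[l][j] for j in range(p))
    return r

p = 5  # illustrative prime size
f = [[float((3 * i + i * j + j * j) % 7) for j in range(p)] for i in range(p)]
mean = sum(sum(row) for row in f) / p**2
f0 = [[v - mean for v in row] for row in f]   # zero-mean image

r = frat(f0, p)
proj_sums = [sum(row) for row in r]           # one sum per direction, as in (10)
```

For the zero-mean image, every entry of `proj_sums` should vanish up to rounding, which is the one linear constraint per direction that leaves $p - 1$ free coefficients.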

By analogy with the continuous case, the finite back- 
projection (FBP) operator is defined as the sum of Radon 
coefficients of all the lines that go through a given point, 
that is 

$$FBP_r(i, j) = \frac{1}{\sqrt{p}} \sum_{(k,l) \in P_{i,j}} r_k[l], \quad (i, j) \in Z_p^2, \tag{11}$$

where $P_{i,j}$ denotes the set of indices of all the lines that go through a point $(i, j) \in Z_p^2$. More specifically, using (9) we can write

$$P_{i,j} = \{(k, l) : l = j - ki \pmod p,\ k \in Z_p\} \cup \{(p, i)\}. \tag{12}$$

From the property of the finite geometry $Z_p^2$ that every two points lie on exactly one line, it follows that every point in $Z_p^2$ lies on exactly one line from the set $P_{i,j}$, except for the point $(i, j)$ which lies on all $p + 1$ lines. Thus, by substituting (8) into (11) we obtain

$$FBP_r(i, j) = \frac{1}{p} \sum_{(k,l) \in P_{i,j}} \sum_{(i',j') \in L_{k,l}} f[i', j'] = \frac{1}{p} \Big( \sum_{(i',j') \in Z_p^2} f[i', j'] + p\, f[i, j] \Big) = f[i, j], \tag{13}$$

where the last equality uses the zero-mean assumption on $f$.



So the back-projection operator defined in (11) indeed 
computes the inverse FRAT for zero-mean images. Therefore,
we have an efficient and exact reconstruction algorithm 
for the FRAT. Furthermore, since the FBP operator is the 
adjoint of the FRAT operator, the algorithm for the inverse 
of FRAT has the same structure and is symmetric with the 
algorithm for the forward transform. 
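
The inversion property in (13) can be checked numerically. The sketch below is illustrative only: the names `frat` and `fbp`, the test image and $p = 5$ are assumptions made for the example, and both operators are written as direct sums rather than fast algorithms.

```python
import math

def frat(f, p):
    """Finite Radon transform (8); r[k][l], with k = p the vertical direction."""
    scale = 1.0 / math.sqrt(p)
    r = [[scale * sum(f[i][(k * i + l) % p] for i in range(p))
          for l in range(p)] for k in range(p)]
    r.append([scale * sum(f[l][j] for j in range(p)) for l in range(p)])
    return r

def fbp(r, p):
    """Finite back-projection (11): sum over the p + 1 lines through (i, j)."""
    scale = 1.0 / math.sqrt(p)
    g = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            s = r[p][i]                      # vertical line through (i, j): index (p, i)
            for k in range(p):
                s += r[k][(j - k * i) % p]   # line of slope k: l = j - k*i (mod p), see (12)
            g[i][j] = scale * s
    return g

p = 5  # illustrative prime size
f = [[float((3 * i + i * j + j * j) % 7) for j in range(p)] for i in range(p)]
mean = sum(sum(row) for row in f) / p**2
f0 = [[v - mean for v in row] for row in f]

recon0 = fbp(frat(f0, p), p)   # zero-mean input: exact round trip
recon = fbp(frat(f, p), p)     # nonzero mean: each pixel offset by p * mean
err0 = max(abs(recon0[i][j] - f0[i][j]) for i in range(p) for j in range(p))
err_offset = max(abs(recon[i][j] - (f[i][j] + p * mean))
                 for i in range(p) for j in range(p))
```

The second check follows from the middle expression in (13): without mean removal, the sum over all pixels contributes $p$ times the mean to every reconstructed pixel.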

It is easy to see that the FRAT requires exactly $p^3$ additions and $p^2$ multiplications. Moreover, for memory access efficiency, [20] describes an algorithm for the FRAT in which for each projection $k$ we need to pass through every pixel of the original image only once using $p$ histogrammers, one for each summation in (8) of that projection. For images of moderate sizes, we observed that the actual computational time of the FRAT is comparable to that of other $O(p^2 \log p^2)$ transforms, such as the 2-D FFT, where the leading constant can be large. For example, on a Sun Ultra 5 computer, both the forward and inverse FRATs take less than a second to compute on an image of size $257 \times 257$.

B. Optimal Ordering of the Finite Radon Transform Coefficients

The FRAT described in Section III-A uses (9) as a convenient way of specifying finite lines on the $Z_p^2$ grid via two parameters: the slope $k$ and the intercept $l$. However, it is neither a unique nor the best way for our purpose. Let us consider a more general definition of lines on the finite $Z_p^2$ plane as

$$L'_{a,b,t} = \{(i, j) \in Z_p^2 : ai + bj - t = 0 \pmod p\}, \tag{14}$$

where $a, b, t \in Z_p$ and $(a, b) \ne (0, 0)$.

This is by analogy with the line equation $x_1 \cos\theta + x_2 \sin\theta - t = 0$ in $\mathbb{R}^2$. Therefore, for a finite line defined as in (14), $(a, b)$ has the role of the normal vector, while $t$ is the translation parameter. In this section, all equations involving line parameters are carried out in the finite field $Z_p$, which is assumed in the sequel without the indication of mod $p$.

It is easy to verify that for a fixed normal vector $(a, b)$, $\{L'_{a,b,t} : t \in Z_p\}$ is a set of $p$ parallel lines in the $Z_p^2$ plane. This set is equal to the set of $p$ lines $\{L_{k,l} : l \in Z_p\}$ defined in (9) with the same slope $k$, where $k = -b^{-1} a$ for $b \ne 0$ and $k = p$ for $b = 0$. Moreover, the set of lines with the normal vector $(a, b)$ is equal to the set of lines with the normal vector $(na, nb)$, for each $n = 1, 2, \ldots, p - 1$.

With the general line specification in (14), we now define the new FRAT to be

$$r_{a,b}[t] = FRAT_f(a, b, t) = \frac{1}{\sqrt{p}} \sum_{(i,j) \in L'_{a,b,t}} f[i, j]. \tag{15}$$

From the discussion above we see that a new FRAT projection sequence $(r_{a,b}[0], r_{a,b}[1], \ldots, r_{a,b}[p-1])$ is simply a reordering of a projection sequence $(r_k[0], r_k[1], \ldots, r_k[p-1])$ from (8). This ordering is important for us since we later apply a 1-D wavelet transform on each FRAT projection. Clearly, the chosen normal vectors $(a, b)$ control the order of the coefficients in each FRAT projection, as well as the represented directions of those projections.
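
To make the reordering concrete, the sketch below computes a projection from a normal vector as in (15) and compares it with the slope-based projection of (8). The function name `frat_normal`, the test image, $p = 7$ and the normal vector $(2, 3)$ are assumptions chosen for illustration. Solving $ai + bj = t$ for $j$ gives slope $k = -b^{-1}a$ and intercept $l = b^{-1}t$, so one expects $r_{a,b}[t] = r_k[b^{-1} t]$.

```python
import math

def frat_normal(f, p, a, b):
    """Projection r_{a,b}[t] of (15): sum f over the line a*i + b*j = t (mod p)."""
    scale = 1.0 / math.sqrt(p)
    r = [0.0] * p
    for i in range(p):
        for j in range(p):
            r[(a * i + b * j) % p] += scale * f[i][j]   # one accumulator per t
    return r

p = 7  # illustrative prime size
f = [[float((2 * i + i * j + 3 * j * j) % 11) for j in range(p)] for i in range(p)]

a, b = 2, 3
b_inv = pow(b, -1, p)                  # modular inverse (needs Python 3.8+)
k = (-b_inv * a) % p                   # slope of the same family of lines
r_ab = frat_normal(f, p, a, b)
r_k = frat_normal(f, p, (-k) % p, 1)   # usual ordering: normal vector (-k, 1)

# Same coefficients, permuted: r_ab[t] equals r_k[b^{-1} t mod p].
same_values = all(abs(r_ab[t] - r_k[(b_inv * t) % p]) < 1e-9 for t in range(p))
```

Note that `frat_normal` visits each pixel once and keeps $p$ running sums, which is in the spirit of the histogrammer description in Section III-A, although it is not claimed to reproduce the algorithm of [20].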

The usual FRAT described in Section III-A uses the set of $(p + 1)$ normal vectors $u_k$, where

$$u_k = (-k, 1) \ \text{ for } k = 0, 1, \ldots, p - 1, \quad \text{and} \quad u_p = (1, 0). \tag{16}$$

In order to provide a complete representation, we need the FRAT to be defined as in (15) with a set of $p + 1$ normal vectors $\{(a_k, b_k) : k \in Z_p^*\}$ such that they cover all $p + 1$ distinct FRAT projections represented by $\{u_k : k \in Z_p^*\}$. We have $p - 1$ choices for each of those normal vectors as

$$(a_k, b_k) = n u_k, \quad 1 \le n \le p - 1.$$

So what is a good choice for the $p + 1$ normal vectors 
of the FRAT? To answer this we first prove the following 
projection slice theorem for the general FRAT. A special 
case of this theorem is already shown in [20]. 

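Defining $W_p = e^{-2\pi\sqrt{-1}/p}$, the discrete Fourier transform (DFT) of a function $f$ on $Z_p^2$ can be written as

$$F[u, v] = \frac{1}{p} \sum_{(i,j) \in Z_p^2} f[i, j]\, W_p^{ui + vj}, \tag{17}$$

and for FRAT projections on $Z_p$ as

$$R_{a,b}[w] = \frac{1}{\sqrt{p}} \sum_{t \in Z_p} r_{a,b}[t]\, W_p^{wt}. \tag{18}$$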



Theorem 1 (Discrete projection-slice theorem) The 1-D DFT $R_{a,b}[w]$ of a FRAT projection $r_{a,b}[t]$ is identical to the 2-D DFT $F[u, v]$ of $f[i, j]$ evaluated along a discrete slice through the origin at direction $(a, b)$:

$$R_{a,b}[w] = F[aw, bw]. \tag{19}$$

Proof: Substituting (15) into (18) and using the fact that the set of $p$ parallel lines $\{L'_{a,b,t} : t \in Z_p\}$ provides a complete cover of the plane $Z_p^2$, we obtain

$$R_{a,b}[w] = \frac{1}{p} \sum_{t \in Z_p} \sum_{(i,j) \in L'_{a,b,t}} f[i, j]\, W_p^{wt} = \frac{1}{p} \sum_{(i,j) \in Z_p^2} f[i, j]\, W_p^{w(ai + bj)} = F[aw, bw].$$

■
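
Theorem 1 can be checked numerically on a small example. The following sketch evaluates both sides of (19) by direct summation, using the normalizations of (17) and (18); the helper names, the test image, $p = 7$ and the normal vector $(1, 3)$ are illustrative assumptions.

```python
import cmath
import math

def frat_normal(f, p, a, b):
    """Projection r_{a,b}[t] of (15)."""
    scale = 1.0 / math.sqrt(p)
    r = [0.0] * p
    for i in range(p):
        for j in range(p):
            r[(a * i + b * j) % p] += scale * f[i][j]
    return r

def dft2(f, p, u, v):
    """2-D DFT coefficient F[u, v] with the 1/p normalization of (17)."""
    W = cmath.exp(-2j * cmath.pi / p)
    return sum(f[i][j] * W ** ((u * i + v * j) % p)
               for i in range(p) for j in range(p)) / p

def dft1(r, p, w):
    """1-D DFT coefficient R[w] with the 1/sqrt(p) normalization of (18)."""
    W = cmath.exp(-2j * cmath.pi / p)
    return sum(r[t] * W ** ((w * t) % p) for t in range(p)) / math.sqrt(p)

p = 7          # illustrative prime size
a, b = 1, 3    # illustrative normal vector
f = [[float((2 * i + i * j + 3 * j * j) % 11) for j in range(p)] for i in range(p)]

r = frat_normal(f, p, a, b)
# Largest gap between R_{a,b}[w] and F[aw, bw] over all w in Z_p.
max_gap = max(abs(dft1(r, p, w) - dft2(f, p, (a * w) % p, (b * w) % p))
              for w in range(p))
```

The gap should be at the level of floating-point rounding, since the identity is exact in exact arithmetic.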
Fig. 4. Example of a discrete Fourier slice (indicated by the black squares) with the best normal vector for that FRAT projection. In this example, $p = 17$ and the slope $k = 11$. The normal vector can be chosen as a vector from the origin to any other point on the Fourier slice. The best normal vector is $(1, 3)$ (the solid arrow).



From (19), we can see the role of the FRAT normal vectors $(a, b)$ in the DFT domain: they control the order of the coefficients in the corresponding Fourier slices. In particular, $F[a, b]$ equals the first harmonic component of the FRAT projection sequence with the normal vector $(a, b)$. For the type of images that we are interested in, e.g. of natural scenes, most of the energy is concentrated in the low frequencies. Therefore in these cases, in order to ensure that each FRAT projection is smooth or low frequency dominated so that it can be represented well by the wavelet transform, the represented normal vector $(a, b)$ should be chosen to be as "close" to the origin of the Fourier plane as possible.

Figure 4 illustrates this by showing an example of a dis- 
crete Fourier slice. The normal vector for the correspond- 
ing FRAT projection can be chosen as a vector from the 
origin to any other point on the Fourier slice. However, 
the best normal vector is selected as the closest point to 
the origin. The choice of the normal vector (a, b) as the 
closest point to the origin causes the represented direction 
of the FRAT projection to have the least "wrap around" 
due to the periodization of the transform. The effect of 
the new ordering of FRAT coefficients in the image domain 
is illustrated in Figure 5 for the same example projection. 
As can be seen, the "wrap around" effect is significantly 
reduced with the optimal ordering compared to the usual 
one. 

Formally, we define the set of $p + 1$ optimal normal vectors $\{(a_k^*, b_k^*) : k \in Z_p^*\}$ as follows

$$(a_k^*, b_k^*) = \arg \min_{\substack{(a_k, b_k) \in \{n u_k :\ 1 \le n \le p - 1\} \\ \text{s.t. } C_p(b_k) \ge 0}} \big\| (C_p(a_k), C_p(b_k)) \big\|. \tag{20}$$

Here $C_p(x)$ denotes the centralized function of period $p$: $C_p(x) = x - p \cdot \mathrm{round}(x/p)$. Hence, $\|(C_p(a_k), C_p(b_k))\|$ represents the distance from the origin to the point $(a_k, b_k)$ on the periodic Fourier plane as shown in Figure 4. The constraint $C_p(b_k) \ge 0$ is imposed in order to remove the ambiguity in deciding between $(a, b)$ and $(-a, -b)$ as the normal vector for a projection. As a result, the optimal normal vectors are restricted to have angles in $[0, \pi)$. We use the 2-norm for solving (20). Minimization is simply done for each $k \in Z_p^*$ by computing the $p - 1$ distances in (20) and selecting the smallest one. Figure 6 shows an example of the optimal set of normal vectors. In comparison with the usual set of normal vectors $\{u_k : k \in Z_p^*\}$ as given in (16), the new set $\{(a_k^*, b_k^*) : k \in Z_p^*\}$ provides a much more uniform angular coverage.

Fig. 5. Lines for the FRAT projection as shown in Figure 4 using: (a) usual ordering, (b) optimal ordering. They both represent the same set of lines but with different orderings. The orderings are signified by the increasing of gray-scales. The arrows indicate the represented directions in each case.

After obtaining the set of normal vectors $\{(a_k^*, b_k^*)\}$, we can compute the FRAT and its inverse with the same fast algorithms using histogrammers described in Section III-A. For a given $p$, solving (20) requires $O(p^2)$ operations and therefore it is negligible compared to the transforms themselves. Furthermore, this can be pre-computed, and thus presents only a "one-time" cost.
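
The search in (20) is small enough to sketch directly. In the fragment below, the function names `centered` and `optimal_normal` are illustrative assumptions; the centering is done with integer arithmetic, which agrees with $x - p \cdot \mathrm{round}(x/p)$ for odd primes $p$, and ties are broken by taking the first minimizer encountered (a choice made here, not specified in the text).

```python
def centered(x, p):
    """C_p(x): representative of x mod p in [-(p-1)/2, (p-1)/2], for odd p."""
    half = p // 2
    return (x + half) % p - half

def optimal_normal(p, k):
    """Search (20) for direction index k in 0..p, returning centered (a, b)."""
    u = (1, 0) if k == p else ((-k) % p, 1)     # usual normal vector u_k from (16)
    best, best_d2 = None, None
    for n in range(1, p):                       # the p - 1 candidates n * u_k
        a, b = centered(n * u[0], p), centered(n * u[1], p)
        if b < 0:                               # constraint C_p(b) >= 0
            continue
        d2 = a * a + b * b                      # squared 2-norm distance to the origin
        if best_d2 is None or d2 < best_d2:
            best, best_d2 = (a, b), d2
    return best

example = optimal_normal(17, 11)                # the case shown in Figure 4
all_normals = [optimal_normal(17, k) for k in range(18)]
```

For $p = 17$ and $k = 11$, the candidates are the multiples of $(6, 1)$ modulo 17, and the closest admissible one is $(1, 3)$, in agreement with the caption of Figure 4.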

For the sake of simplicity, we write $r_k[t]$ for $r_{a_k^*, b_k^*}[t]$ in the sequel. In other words, from now on we regard $k$ as an index in the set of optimal FRAT normal vectors rather than a slope. Likewise, the line $L'_{a_k^*, b_k^*, t}$ is simply rewritten as $L_{k,t}$, for $0 \le k \le p$, $0 \le t < p$.
Fig. 6. The set of normal vectors, which indicate the represented directions, for the FRAT of size $p = 17$ using: (a) usual ordering; (b) optimal ordering.



C. Frame Analysis of the FRAT 

Since the FRAT is a redundant transform, it can be stud- 
ied as a frame operator. In this section we will study the 
FRAT in more detail and reveal some of its properties in 
this frame setting. A detailed introduction to frames can 
be found in [24], [3]. 

Suppose that $T$ is a linear operator from $\mathbb{R}^N$ to $\mathbb{R}^M$, defined by

$$(Tx)_n = \langle x, \varphi_n \rangle, \quad \text{for } n = 1, \ldots, M. \tag{21}$$

The set $\{\varphi_n\}_{n=1}^{M} \subset \mathbb{R}^N$ is called a frame of $\mathbb{R}^N$ if there exist two constants $A > 0$ and $B < \infty$ such that

$$A \|x\|^2 \le \sum_{n=1}^{M} |\langle x, \varphi_n \rangle|^2 \le B \|x\|^2, \quad \forall x \in \mathbb{R}^N, \tag{22}$$

where $A$ and $B$ are called the frame bounds. When $A = B$ the frame is said to be tight. If the frame condition is satisfied then $T$ is called a frame operator. It can be shown that any finite set of vectors that spans $\mathbb{R}^N$ is a frame. The frame bound ratio $B/A$ indicates the numerical stability in reconstructing $x$ from $(Tx)_n$; the tighter the frame, the more stable the reconstruction against coefficient noise.

The frame operator can be regarded as a left matrix multiplication with $F$, where $F$ is an $M \times N$ matrix whose $n$th row equals $\varphi_n$. The frame condition (22) can be rewritten as

$$x^T A x \le x^T F^T F x \le x^T B x, \quad \forall x \in \mathbb{R}^N. \tag{23}$$

Since $F^T F$ is symmetric, it is diagonalizable in an orthonormal basis [25], thus (23) implies that the eigenvalues of $F^T F$ are between $A$ and $B$. Therefore, the tightest possible frame bounds $A$ and $B$ are the minimum and maximum eigenvalues of $F^T F$, respectively. In particular, a tight frame is equivalent to $F^T F = A \cdot I$, which means the transpose of $F$ equals its left inverse within a scale factor $A$.

Now let us return to the FRAT. Since it is invertible it can be regarded as a frame operator in $l_2(Z_p^2)$ with the frame $\{\varphi_{k,l} : k \in Z_p^*, l \in Z_p\}$ defined as

$$\varphi_{k,l} = p^{-1/2}\, \delta_{L_{k,l}}, \tag{24}$$

where $\delta_S$ denotes the characteristic function for the set $S$, which means $\delta_S[i, j]$ equals 1 if $(i, j) \in S$ and 0 otherwise. Note that this frame is normalized since $\|\varphi_{k,l}\| = 1$. By writing images as column vectors, the FRAT can be regarded as a left matrix multiplication with $F = p^{-1/2} R$, where $\{R\}_{(k,l),(i,j)}$ is the $(p^2 + p) \times p^2$ incidence matrix of the affine geometry $Z_p^2$: $R_{(k,l),(i,j)}$ equals 1 if $(i, j) \in L_{k,l}$ and 0 otherwise.

Proposition 1: The tightest bounds for the FRAT frame $\{\varphi_{k,l} : k \in Z_p^*, l \in Z_p\}$ in $l_2(Z_p^2)$ are $A = 1$ and $B = p + 1$.

Proof: From (23), these tightest bounds can be computed from the eigenvalues of $C = F^T F = p^{-1} R^T R$. Since $R$ is the incidence matrix for lines in $Z_p^2$, $(R^T R)_{(i,j),(i',j')}$ equals the number of lines that go through both $(i, j)$ and $(i', j')$. Using the properties of the finite geometry $Z_p^2$ that every two points lie on exactly one line and that there are exactly $p + 1$ lines that go through each point, it follows that the entries of $C$ equal $(p + 1) p^{-1}$ along its diagonal and $p^{-1}$ elsewhere.

The key observation is that $C$ is a circulant matrix, hence its eigenvalues can be computed as the $p^2$-point discrete Fourier transform (DFT) of its first column $c = ((p + 1) p^{-1}, p^{-1}, \ldots, p^{-1})$ [1] (§2.4.8). Writing $c$ as

$$c = (1, 0, \ldots, 0) + p^{-1} \cdot (1, 1, \ldots, 1),$$

we obtain

$$\mathrm{DFT}\{c\} = (1, 1, \ldots, 1) + p \cdot (1, 0, 0, \ldots, 0) = (p + 1, 1, 1, \ldots, 1),$$

where the DFT is computed for the Dirac and constant signals.

Therefore the eigenvalues of $C$ are $p + 1$ and 1, the latter with multiplicity $p^2 - 1$. As a result, the tightest (normalized) frame bounds for the FRAT are $A = 1$ and $B = p + 1$. ■
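
The structure of $C = F^T F$ used in this proof can be confirmed by brute force for a small prime. The sketch below is only a sanity check under illustrative assumptions (the helper name `incidence`, the choice $p = 3$, and the particular zero-mean test vector); rather than computing a full eigendecomposition, it verifies the entries of $C$ and applies $C$ to one constant and one zero-mean vector.

```python
def incidence(p):
    """Incidence matrix R: rows indexed by lines (k, l), columns by points (i, j)."""
    rows = []
    for k in range(p + 1):
        for l in range(p):
            row = [0] * (p * p)
            for i in range(p):
                for j in range(p):
                    # k = p: vertical line i = l; otherwise j = k*i + l (mod p).
                    on_line = (i == l) if k == p else ((j - k * i - l) % p == 0)
                    if on_line:
                        row[i * p + j] = 1
            rows.append(row)
    return rows

p = 3  # illustrative prime size
R = incidence(p)
n = p * p
# C = (1/p) R^T R, computed entry by entry.
C = [[sum(R[m][x] * R[m][y] for m in range(len(R))) / p for y in range(n)]
     for x in range(n)]

diag_ok = all(abs(C[x][x] - (p + 1) / p) < 1e-12 for x in range(n))
off_ok = all(abs(C[x][y] - 1 / p) < 1e-12
             for x in range(n) for y in range(n) if x != y)

ones = [1.0] * n
C_ones = [sum(C[x][y] * ones[y] for y in range(n)) for x in range(n)]
v = [1.0, -1.0] + [0.0] * (n - 2)               # a zero-mean test vector
C_v = [sum(C[x][y] * v[y] for y in range(n)) for x in range(n)]
```

One expects `C_ones` to equal $(p + 1)$ times the constant vector and `C_v` to equal `v`, matching the eigenvalues $p + 1$ and 1.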



For reconstruction, the FBP defined in (11) can be represented by a left multiplication with the matrix $p^{-1/2} B$, where $B_{(i,j),(k,l)}$ equals 1 if $(k, l) \in P_{i,j}$ and 0 otherwise. From the definition of $P_{i,j}$, we have

$$B_{(i,j),(k,l)} = R_{(k,l),(i,j)}, \quad \forall i, j, k, l.$$

So the transform matrices for the operators FRAT and FBP are transposes of each other. Let $\mathcal{Z}_p^2$ denote the subspace of zero-mean images defined on $Z_p^2$. Since the FBP is an inverse of the FRAT for zero-mean images, we have the following result.

Proposition 2: On the subspace of zero-mean images $\mathcal{Z}_p^2$, the FRAT is a normalized tight frame with $A = B = 1$, which means

$$f = \sum_{k=0}^{p} \sum_{l=0}^{p-1} \langle f, \varphi_{k,l} \rangle\, \varphi_{k,l}, \quad \forall f \in \mathcal{Z}_p^2. \tag{25}$$

Remark 1: It is instructive to note that constant images 
on $Z_p^2$ are eigenvectors of $C = F^T F$ with the eigenvalue 
p + 1. Taking constant images out leaves a system with 
all unity eigenvalues, or a tight frame on the remaining 
subspace. Thus, we have another interpretation of FRAT 
being a normalized tight frame for zero-mean images. 

By subtracting the mean from the image before applying
the FRAT, we change the frame bound ratio from $p + 1$
to 1 and obtain a tight frame. Consequently, this makes 
the reconstruction more robust against noise on the FRAT 
coefficients due to thresholding and/or quantization. This 
follows from the result in [26] that with the additive white 
noise model for the coefficients, the tight frame is optimal 
among normalized frames in minimizing mean-squared er- 
ror. 

IV. Orthonormal Finite Ridgelet Transform 

With an invertible FRAT and applying (6), we can ob- 
tain an invertible discrete ridgelet transform by taking the 
discrete wavelet transform (DWT) on each FRAT projection
sequence, $(r_k[0], r_k[1], \ldots, r_k[p-1])$, where the direction
$k$ is fixed. We call the overall result the finite ridgelet 
transform (FRIT). Figure 7 depicts these steps. 

[Diagram: Image → FRAT domain → FRIT domain, with the FRAT applied to the image and a DWT applied along each projection.]

Fig. 7. Diagram for the FRIT. After taking the FRAT, a DWT is applied on each of the FRAT slices or projections where $k$ is fixed.

Typically p is not dyadic, therefore a special border han- 
dling is required. Appendix A details one possible way 
of computing the DWT for prime length signals. Due to 
the periodicity property of the FRAT coefficients for each 
direction, periodic wavelet transforms are chosen and as- 
sumed in this section. 

Recall that the FRAT is redundant and not orthogonal. 
Next we will show that by taking the 1-D DWT on the 
projections of the FRAT in a special way, we can remove 
this redundancy and obtain an orthonormal transform. 

Assume that the DWT is implemented by an orthogonal tree-structured filter bank with $J$ levels, where $G_0$ and $G_1$ are low and high pass synthesis filters, respectively. Then the family of functions

$$\big\{ g_0^{(J)}[\cdot - 2^J m],\ g_1^{(j)}[\cdot - 2^j m] : j = 1, \ldots, J;\ m \in \mathbb{Z} \big\}$$

is the orthogonal basis of the discrete-time wavelet series [1]. Here $G^{(j)}$ denotes the equivalent synthesis filters at level $j$, or more specifically

$$G_0^{(J)}(z) = \prod_{k=0}^{J-1} G_0(z^{2^k}),$$
$$G_1^{(j)}(z) = G_1(z^{2^{j-1}}) \prod_{k=0}^{j-2} G_0(z^{2^k}), \quad j = 1, \ldots, J.$$

The basis functions from $G_0^{(J)}$ are called the scaling functions, while all the other functions in the wavelet basis are called wavelet functions. Typically, the filter $G_1$ is designed to satisfy the high pass condition, $G_1(z)|_{z=1} = 0$, so that the corresponding wavelet has at least one vanishing moment. Therefore, $G_1^{(j)}(z)|_{z=1} = 0$, $\forall j = 1, \ldots, J$, which means all wavelet basis functions have zero mean.

For a more general setting, let us assume that we have a collection of $(p + 1)$ 1-D orthonormal transforms on $\mathbb{R}^p$ (which can be the same), one for each projection $k$ of the FRAT, that have bases

$$\{ w_m^{(k)} : m \in Z_p \}, \quad k = 0, 1, \ldots, p.$$

The only condition that we require for each of these bases 
can be expressed equivalently by the following lemma. 

Lemma 1 (Condition Z) Suppose that $\{w_m : m \in Z_p\}$ is an orthogonal basis for the finite-dimensional space $\mathbb{R}^p$, then the following are equivalent:

1. This basis contains a constant function, say $w_0$, i.e. $w_0[l] = \text{const}$, $\forall l \in Z_p$.

2. All other basis functions, $w_m$, $m = 1, \ldots, p - 1$, have zero mean.

Proof: Denote $\mathbf{1} = (1, 1, \ldots, 1) \in \mathbb{R}^p$. If $w_0 = c\mathbf{1}$, $c \ne 0$, then from the orthogonality assumption that $\langle w_0, w_m \rangle = 0$, we obtain $\sum_l w_m[l] = 0$, $\forall m = 1, \ldots, p - 1$.

Conversely, assume that each basis function $w_m$, $1 \le m \le p - 1$, has zero mean. Denote by $S$ the subspace that is spanned by these functions and by $S^\perp$ its orthogonal complement subspace in $\mathbb{R}^p$. It is clear that $S^\perp$ has dimension 1 with $w_0$ as its basis. Consider the subspace $S_0 = \{c\mathbf{1} : c \in \mathbb{R}\}$. We have $\langle c\mathbf{1}, w_m \rangle = c \sum_l w_m[l] = 0$, $\forall m = 1, \ldots, p - 1$, thus $S_0 \subset S^\perp$. On the other hand, $\dim(S_0) = \dim(S^\perp) = 1$, therefore $S^\perp = S_0$. This means $w_0$ is a constant function. ■

As shown before, the Condition Z is satisfied for all 
wavelet bases, or in fact any general tree-structured fil- 
ter banks where the all-lowpass branch is carried to the 
maximum number of stages (i.e. when only one scaling 
coefficient is left). 
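
As a concrete instance of Condition Z for a prime length, the sketch below builds the orthonormal DCT-II basis and checks both statements of Lemma 1. The DCT is used here only as a convenient stand-in that works for any length without border handling; it is an assumption of this example rather than the wavelet construction of Appendix A, and the name `dct_basis` and the choice $p = 7$ are illustrative.

```python
import math

def dct_basis(p):
    """Orthonormal DCT-II basis of R^p; basis[0] is the constant vector."""
    basis = []
    for m in range(p):
        c = math.sqrt(1.0 / p) if m == 0 else math.sqrt(2.0 / p)
        basis.append([c * math.cos(math.pi * m * (2 * l + 1) / (2 * p))
                      for l in range(p)])
    return basis

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

p = 7  # illustrative prime length
w = dct_basis(p)

# Orthonormality: Gram matrix should be the identity.
gram_err = max(abs(dot(w[m], w[n]) - (1.0 if m == n else 0.0))
               for m in range(p) for n in range(p))
# Statement 1 of Lemma 1: w_0 is constant.
w0_spread = max(w[0]) - min(w[0])
# Statement 2 of Lemma 1: every other basis vector has zero mean.
max_mean = max(abs(sum(w[m]) / p) for m in range(1, p))
```

All three quantities should be zero up to rounding, so this basis is admissible as one of the $p + 1$ transforms in the general setting above.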
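By definition, the FRIT can be written as

$$FRIT_f[k, m] = \big\langle FRAT_f[k, \cdot],\ w_m^{(k)}[\cdot] \big\rangle = \sum_{l \in Z_p} w_m^{(k)}[l]\, \langle f, \varphi_{k,l} \rangle = \Big\langle f,\ \sum_{l \in Z_p} w_m^{(k)}[l]\, \varphi_{k,l} \Big\rangle. \tag{26}$$

Here $\{\varphi_{k,l}\}$ is the FRAT frame which is defined in (24). Hence we can write the basis functions for the FRIT as follows:

$$\rho_{k,m} = \sum_{l \in Z_p} w_m^{(k)}[l]\, \varphi_{k,l}. \tag{27}$$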

We can next prove the result on the orthogonality of a 
modified FRIT. 

Theorem 2: Given $p + 1$ orthonormal bases in $l_2(Z_p)$ (which can be the same): $\{w_m^{(k)} : m \in Z_p\}$, $0 \le k \le p$, that satisfy the Condition Z, then

$$\{\rho_{k,m} : k = 0, 1, \ldots, p;\ m = 1, 2, \ldots, p - 1\} \cup \{\rho_0\}$$

is an orthonormal basis in $l_2(Z_p^2)$, where $\rho_{k,m}$ are defined in (27) and $\rho_0$ is the constant function, $\rho_0[i, j] = 1/p$, $\forall (i, j) \in Z_p^2$.
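Proof: Let us consider the inner products between any two
FRIT basis functions

$$\langle \rho_{k,m}, \rho_{k',m'} \rangle = \sum_{l, l' \in Z_p} w_m^{(k)}[l]\, w_{m'}^{(k')}[l']\, \langle \psi_{k,l}, \psi_{k',l'} \rangle.$$

Using properties of lines in the finite geometry $Z_p^2$, it is
easy to verify that

$$\langle \psi_{k,l}, \psi_{k',l'} \rangle = \begin{cases} 1 & \text{if } k = k',\ l = l' \\ 0 & \text{if } k = k',\ l \neq l' \\ 1/p & \text{if } k \neq k'. \end{cases} \qquad (28)$$

Thus, when the two FRIT basis functions have the same
direction, $k = k'$, then

$$\langle \rho_{k,m}, \rho_{k,m'} \rangle = \sum_{l \in Z_p} w_m^{(k)}[l]\, w_{m'}^{(k)}[l] = \delta[m - m'].$$

So the orthogonality of these FRIT basis functions comes
from the orthogonality of the basis $\{w_m^{(k)} : m \in Z_p\}$. In
particular, we see that $\rho_{k,m}$ have unit norm. Next, for
the case when the two FRIT basis functions have different
directions, $k \neq k'$, using (28) we obtain

$$\langle \rho_{k,m}, \rho_{k',m'} \rangle = \frac{1}{p} \sum_{l, l' \in Z_p} w_m^{(k)}[l]\, w_{m'}^{(k')}[l'] = \frac{1}{p} \Big( \sum_{l \in Z_p} w_m^{(k)}[l] \Big) \Big( \sum_{l' \in Z_p} w_{m'}^{(k')}[l'] \Big).$$

Finally, note that $\bigcup_{l} L_{k,l} = Z_p^2$, for all directions $k$ (see
(10)). So, together with the assumption that $w_0^{(k)}$ are constant
functions, we see that all of the FRIT basis functions
$\rho_{k,0}$ ($k = 0, 1, \ldots, p$) correspond to the mean of the input
image so we only need to keep one of them (in any direction),
which is denoted as $\rho_0$. The proof is now complete. ■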
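
Theorem 2 can be checked numerically for a small prime. The sketch below is illustrative only and is not the authors' Matlab code. It assumes the FRAT frame of (24) is $\psi_{k,l} = p^{-1/2}\delta_{L_{k,l}}$, with lines $L_{k,l} = \{(i,j) : j = ki + l \pmod p\}$ for $k < p$ and $L_{p,l} = \{(l,j)\}$, as defined in the earlier sections. The function names and the choice of an orthonormal DCT-II basis (whose first vector is constant, so Condition Z holds) are our own assumptions.

```python
import numpy as np

def frat_frame(p):
    """psi[k, l] is the p x p image p**-0.5 * indicator(L_{k,l}) (assumed form of (24))."""
    psi = np.zeros((p + 1, p, p, p))
    i = np.arange(p)
    for k in range(p):
        for l in range(p):
            psi[k, l, i, (k * i + l) % p] = 1.0   # line j = k*i + l (mod p)
    for l in range(p):
        psi[p, l, l, :] = 1.0                     # line i = l
    return psi / np.sqrt(p)

def dct_basis(p):
    """Orthonormal DCT-II basis of R^p; row 0 is constant, so Condition Z holds."""
    l = np.arange(p)
    w = np.cos(np.pi * np.outer(np.arange(p), 2 * l + 1) / (2 * p))
    w[0] *= np.sqrt(1 / p)
    w[1:] *= np.sqrt(2 / p)
    return w

def frit_basis(p, w):
    """Rows: rho_0 followed by rho_{k,m} of (27), k = 0..p, m = 1..p-1."""
    psi = frat_frame(p)
    rho = np.einsum('ml,klij->kmij', w, psi)      # same 1-D basis for every direction
    detail = rho[:, 1:].reshape(-1, p * p)
    rho0 = np.full((1, p * p), 1.0 / p)
    return np.vstack([rho0, detail])

B = frit_basis(5, dct_basis(5))
gram_error = np.abs(B @ B.T - np.eye(25)).max()   # deviation from orthonormality
```

The basis has $(p+1)(p-1) + 1 = p^2$ elements, and `gram_error` should be at the level of floating-point round-off.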

Remark 2: 1. An intuition behind the above result is 
that at each level of the DWT decomposition applied on the 
FRAT projections, all of the non-orthogonality and redun- 
dancy of the FRAT is pushed into the scaling coefficients. 
When the DWT's are taken to the maximum number of 
levels then all of the remaining scaling coefficients at dif- 
ferent projections are the same, hence we can drop all but 
one of them. The result is an orthonormal FRIT. 

2. We prove the above result for the general setting where 
different transforms can be applied on different FRAT pro- 
jections. The choice of transforms can be either adaptive, 
depending on the image, or pre-defined. For example, one 
could employ an adaptive wavelet packet scheme indepen- 
dently on each projection. The orthogonality holds as long 
as the "all lowpass" branch of the general tree-structured 
filter bank is decomposed to a single coefficient. All other 
branches would contain at least one highpass filter thus 
leading to zero-mean basis functions. 

3. Furthermore, due to the "wrap around" effect of the 
FRAT, some of its projections could contain strong periodic
components so that a more oscillatory basis like the DCT
might be more efficient. Also note that from Theorem 1,
if we apply the 1-D Fourier transform on all of the FRAT 
projections then we obtain the 2-D Fourier transform. For 
convenience, we still use the term FRIT to refer to the cases 
where other transforms than the DWT might be applied to 
some of the FRAT projections. 

To gain more insight into the construction for the or- 
thogonal FRIT basis, Figure 8 illustrates a simple example 
of the transform on a 2 x 2 block using the Haar wavelet. 
In this case, the FRIT basis is the same as the 2-D Haar 
wavelet basis, as well as the 2-D discrete Fourier basis. 
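
The $2 \times 2$ case is small enough to verify directly. The following sketch assumes the same line definition for the FRAT frame as in the earlier sections ($j = ki + l \pmod 2$ for $k < 2$, and $i = l$ for $k = 2$). The variable names are ours. It checks that each orthonormal FRIT basis image coincides, up to sign, with a separable 2-D Haar basis image.

```python
import numpy as np

p = 2
s = 1 / np.sqrt(2)
haar = np.array([[s, s], [s, -s]])   # rows: w_0 (constant), w_1 (zero mean)

# FRAT frame for p = 2 (assumed form: normalized indicators of lines).
psi = np.zeros((p + 1, p, p, p))
i = np.arange(p)
for k in range(p):
    for l in range(p):
        psi[k, l, i, (k * i + l) % p] = 1.0
for l in range(p):
    psi[p, l, l, :] = 1.0
psi /= np.sqrt(p)

# rho[k, m] = sum_l haar[m, l] * psi[k, l], as in (27).
rho = np.einsum('ml,klij->kmij', haar, psi)
frit = [np.full((2, 2), 0.5)] + [rho[k, 1] for k in range(p + 1)]

# Separable 2-D Haar basis images.
haar2d = [np.outer(haar[u], haar[v]) for u in range(2) for v in range(2)]

def same_up_to_sign(a, b):
    return np.allclose(a, b) or np.allclose(a, -b)

matches = [any(same_up_to_sign(r, h) for h in haar2d) for r in frit]
```

For length 2 the DFT vectors are $(1, 1)$ and $(1, -1)$ up to scaling, which is why the same images also form the 2-D discrete Fourier basis.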




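In this case (different directions, $k \neq k'$, in the proof of Theorem 2),
if either $m$ or $m'$ is non-zero, e.g. $m \neq 0$, then using the
Condition Z of these bases, $\sum_{l \in Z_p} w_m^{(k)}[l] = 0$,
it implies $\langle \rho_{k,m}, \rho_{k',m'} \rangle = 0$.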



Fig. 8. Illustration of the construction of the orthogonal FRIT basis for
a 2 x 2 block using the Haar wavelet. Upper: Basis images for
the FRAT. Lower: Basis images for the orthogonal FRIT. These
images are obtained by taking the (scaled) Haar transform for
each pair (corresponding to one projection) of the FRAT basis
images. The constant image results from all projections and thus
we can drop all but one of them.

V. Variations on the Theme

A. Folded FRAT and FRIT 

The FRAT in the previous sections is defined with a periodic
basis over $Z_p^2$. This is equivalent to applying the
transform to a periodization of the input image $f$. Therefore
relatively large amplitude FRAT coefficients could result
due to the possible discontinuities across the image
borders. To overcome this problem, we propose a strategy
similar to that of the block cosine transform: extending the
image symmetrically about its borders [3].







Fig. 9. Extending the image symmetrically about its borders in order 
to reduce the discontinuities across the image borders due to the 
periodization. 

Given that $p$ is a prime number and $p > 2$, then $p$ is odd
and can be written as $p = 2n - 1$. Consider an $n \times n$ input
image $f[i,j]$, $0 \le i, j < n$. Fold this image with respect to
the lines $i = 0$ and $j = 0$ to produce a $p \times p$ image $\tilde f[i,j]$,
in which (also see Figure 9)

$$\tilde f[i,j] = f[|i|, |j|], \quad -n < i, j < n. \qquad (29)$$

The periodization of $\tilde f[i,j]$ is symmetric and continuous
across the borders of the original image, thus eliminating
the jump discontinuity that would have resulted from the
periodic extension of $f[i,j]$. Applying the FRAT to
$\tilde f[i,j]$ results in $p(p+1)$ transform coefficients. Notice the
new range for the pixel indices of the image $\tilde f[i,j]$. We
will show that the FRAT coefficients of $\tilde f[i,j]$ exhibit
certain symmetry properties so that the original image can be
perfectly reconstructed by keeping exactly $n^2$ coefficients.
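
As a concrete illustration of the folding in (29), the sketch below builds $\tilde f$ from an $n \times n$ array. The function name `fold_image` and the storage convention (indices taken modulo $p$, so that index $p - i$ holds the sample at $-i$) are our own assumptions, chosen to match the periodic indexing used by the FRAT.

```python
import numpy as np

def fold_image(f):
    """Fold an n x n image into a p x p image, p = 2n - 1, following (29).

    Storage convention (an assumption of this sketch): row/column u in
    0..p-1 represents the centered index u if u < n, else u - p.
    """
    n = f.shape[0]
    p = 2 * n - 1
    u = np.arange(p)
    centered = np.where(u < n, u, u - p)   # representatives in -n < i < n
    idx = np.abs(centered)                 # |i|, |j| as in (29)
    return f[np.ix_(idx, idx)]

f = np.arange(9.0).reshape(3, 3)           # n = 3
ft = fold_image(f)                         # p = 5
p = ft.shape[0]
flip = (-np.arange(p)) % p                 # index map i -> -i (mod p)
```

With this layout the folded image satisfies $\tilde f[-i, j] = \tilde f[i, j] = \tilde f[i, -j]$ modulo $p$, so its periodization has no jump at the borders. For $n = 3$ the full FRAT would produce $p(p+1) = 30$ coefficients, of which $n^2 = 9$ are independent.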

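Consider the 2-D DFT of $\tilde f[i,j]$

$$\tilde F[u,v] = \frac{1}{p} \sum_{-n < i, j < n} \tilde f[i,j]\, W_p^{ui + vj}.$$

Using the symmetry property of $\tilde f[i,j]$ in (29), we obtain

$$\tilde F[u,v] = \tilde F[|u|, |v|].$$

Theorem 1 shows that the FRAT $\tilde r_{a,b}[t]$, $(-n < t < n)$
of $\tilde f[i,j]$ can be computed from the inverse 1-D DFT as

$$\tilde r_{a,b}[t] = \frac{1}{\sqrt{p}} \sum_{-n < w < n} \tilde R_{a,b}[w]\, W_p^{-wt},$$

where $\tilde R_{a,b}[w] = \tilde F[aw, bw]$. The symmetry of $\tilde F[u,v]$ thus
yields

$$\tilde R_{a,b}[w] = \tilde R_{a,b}[|w|] \quad \text{and} \qquad (30)$$

$$\tilde R_{a,b}[w] = \tilde R_{|a|,|b|}[w]. \qquad (31)$$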

From (30) we have $\tilde r_{a,b}[t] = \tilde r_{a,b}[|t|]$, i.e. each projection
$\tilde r_{a,b}[t]$ is symmetric about $t = 0$, and (31) reveals the
duplications among those projections. In fact, with the set
of optimal normal vectors in (20), except for two projections
indexed by (1,0) and (0,1) (the vertical and horizontal
projections, respectively) all other projections have an
identical twin. By removing those duplications we are left
with $2 + (p-1)/2 = n + 1$ projections. For example, we can
select the set of $n + 1$ independent projections as the ones
with normal vectors in the first quadrant (refer to Figure
6). Furthermore, as in (10), the redundancy among the
projections of the folded FRAT can be written as

$$\tilde r_{a_k^*, b_k^*}[0] + 2 \sum_{t=1}^{n-1} \tilde r_{a_k^*, b_k^*}[t] = \frac{1}{\sqrt{p}} \sum_{-n < i, j < n} \tilde f[i,j]. \qquad (32)$$
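
The symmetry of each projection, the twin duplication, and the redundancy relation (32) can be checked numerically. The sketch below assumes, following the earlier sections, that the projection with normal vector $(a,b)$ sums the image over the lines $ai + bj = t \pmod p$ and scales by $p^{-1/2}$. The helper names and the modular storage of the folded image are our own.

```python
import numpy as np

def fold_image(f):
    """p x p folded image of (29), stored with indices taken modulo p."""
    n = f.shape[0]
    p = 2 * n - 1
    u = np.arange(p)
    idx = np.abs(np.where(u < n, u, u - p))
    return f[np.ix_(idx, idx)]

def frat_projection(img, a, b):
    """Projection r_{a,b}[t]: scaled sums over the lines a*i + b*j = t (mod p)."""
    p = img.shape[0]
    i, j = np.meshgrid(np.arange(p), np.arange(p), indexing='ij')
    t = (a * i + b * j) % p
    return np.bincount(t.ravel(), weights=img.ravel(), minlength=p) / np.sqrt(p)

rng = np.random.default_rng(0)
n = 4
ft = fold_image(rng.standard_normal((n, n)))
p = ft.shape[0]                                    # p = 7

r = frat_projection(ft, 2, 1)
r_twin = frat_projection(ft, -2, 1)                # twin projection, see (31)
lhs = r[0] + 2 * r[1:n].sum()                      # left side of (32)
rhs = ft.sum() / np.sqrt(p)                        # right side of (32)
```

Here `r` equals its own reflection `r[(-t) % p]`, `r_twin` equals `r`, and `lhs` agrees with `rhs` up to round-off.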

The next proposition summarizes the above results. 
Proposition 3: The image $f[i,j]$ can be perfectly reconstructed
from the following $n^2 - 1$ coefficients:

$$\tilde r_{a_k^*, b_k^*}[t] \ \text{ such that } \ C_p(a_k^*) \ge 0 \ \text{ and } \ 0 < t < n, \qquad (33)$$

and the mean of the image $\tilde f[i,j]$.

To gain better energy compaction, the mean should be
subtracted from the image $\tilde f[i,j]$ prior to taking the
FRAT. The set of independent coefficients in (33) is
referred to as the folded FRAT of the image $f[i,j]$.

However, orthogonality might be lost in the folded FRIT
(resulting from applying the 1-D DWT on the $n + 1$ projections of
the folded FRAT), since the basis functions from the same
direction of the folded FRAT could overlap. Nevertheless,
if we loosen the orthogonality constraint, then
by construction, the folded FRAT projections
($\tilde r_{a_k^*, b_k^*}[t] : 0 \le t < n$) are symmetric with respect to $t = 0$ and
$t = n - 1/2$. This allows the use of the folded wavelet
transform with biorthogonal symmetric wavelets [27] or
orthogonal symmetric IIR wavelets [28]. We anticipate that the folded
FRIT has potential in block transforms (i.e. dividing the
image into small blocks and applying the FRIT to each block)
where the border effect is more serious, and plan to report the
results in a forthcoming paper.

B. Multilevel FRIT's 

In the FRIT scheme described previously, the multiscale
property comes from the 1-D DWT. As a result, at each scale, there
is a large number of directions, which is about the size of
the input image. Moreover, the basis images of the FRIT
have long support, which extends over the whole image.

Here we propose a different scheme where the number of
directions can be controlled, and the basis functions have
smaller support. Assume that the input image has the size
$n \times n$, where $n = p_1 p_2 \cdots p_J q$ and $p_i$ are prime numbers.
First, we apply the orthonormal FRIT to $n_1 \times n_1$
non-overlapping subimages of size $p_1 \times p_1$, where $n_1 = p_2 \cdots p_J q$.
Each sub-image is transformed into $p_1^2 - 1$ "detail" FRIT
coefficients plus a mean value. These mean values form an
$n_1 \times n_1$ coarse approximate image of the original one. Then
the process can be iterated on the coarse version up to $J$
levels. The result is called the multilevel FRIT (MFRIT).

At each level, the basis functions for the "detail" MFRIT
coefficients are obviously orthogonal within each block, and
also with other blocks since they do not overlap. Furthermore,
these basis functions are orthogonal to the constant
function on their block, and thus orthogonality holds
across levels as well. Consequently, the MFRIT is an
orthonormal transform.

By collecting the MFRIT coefficients into groups
depending on their scales and directions, we obtain a
subband-like decomposition with $J$ scales, where level $i$
has $p_i$ directions. When $p_i = 2$, the orthonormal FRIT
using the Haar DWT is the same as the 2 x 2 Haar DWT
(see Figure 8). Therefore the MFRIT scheme includes the
multilevel 2-D Haar DWT. In general, when $p_i > 2$, the
MFRIT offers more directions than the 2-D DWT and can
be useful in certain applications such as texture analysis.
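
The block structure of the MFRIT can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The per-block transform is the orthonormal FRIT built from an assumed FRAT frame (normalized line indicators) and an orthonormal DCT-II basis in each direction. The names `frit_matrix` and `mfrit` are hypothetical, and the image size is taken to be exactly $p_1 p_2 \cdots p_J$ (that is, $q = 1$). The coarse image here stores the $\rho_0$ coefficient of each block, which is $p$ times the block mean.

```python
import numpy as np

def zero_mean_basis(p):
    """Orthonormal DCT-II basis of R^p; row 0 is constant (Condition Z)."""
    l = np.arange(p)
    w = np.cos(np.pi * np.outer(np.arange(p), 2 * l + 1) / (2 * p))
    w[0] *= np.sqrt(1 / p)
    w[1:] *= np.sqrt(2 / p)
    return w

def frit_matrix(p):
    """p^2 x p^2 orthonormal FRIT matrix; row 0 is the constant function rho_0."""
    w = zero_mean_basis(p)
    i = np.arange(p)
    rows = [np.full(p * p, 1.0 / p)]
    for k in range(p + 1):
        psi = np.zeros((p, p, p))
        for l in range(p):
            if k < p:
                psi[l, i, (k * i + l) % p] = 1.0   # line j = k*i + l (mod p)
            else:
                psi[l, l, :] = 1.0                 # line i = l
        psi = psi.reshape(p, -1) / np.sqrt(p)
        rows.extend(w[1:] @ psi)                   # rho_{k,m}, m = 1..p-1
    return np.array(rows)

def mfrit(image, primes):
    """Return per-level detail coefficients and the final coarse image."""
    details, coarse = [], image
    for p in primes:
        n1 = coarse.shape[0] // p
        blocks = (coarse.reshape(n1, p, n1, p)
                        .transpose(0, 2, 1, 3)
                        .reshape(n1, n1, p * p))
        coeffs = blocks @ frit_matrix(p).T
        details.append(coeffs[..., 1:])            # p^2 - 1 details per block
        coarse = coeffs[..., 0]                    # rho_0 coefficient = p * block mean
    return details, coarse

rng = np.random.default_rng(0)
image = rng.standard_normal((15, 15))              # n = 3 * 5
details, coarse = mfrit(image, [3, 5])
energy = sum((d ** 2).sum() for d in details) + (coarse ** 2).sum()
```

Because every level is orthonormal, `energy` matches the energy of the input image, and the total number of coefficients ($25 \cdot 8 + 24 + 1 = 225$) equals the number of pixels.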

VI. Numerical Experiments 
A. Non-linear Approximation 

Following the study of the efficiency of the ridgelet transform
in the continuous domain using the truncated Gaussian
functions [6], we first perform a numerical comparison
on a 256 x 256 image of the function
$f(x_1, x_2) = 1_{\{x_2 < 2x_1 + 0.5\}}\, e^{-x_1^2 - x_2^2}$ (see Figure 10(a)), using four 2-D
transforms: DCT, DWT, FRAT and FRIT. The comparison
is evaluated in terms of the non-linear approximation
power, i.e. the ability to reconstruct the original image,
measured by signal-to-noise ratios (SNR's), using the $N$
largest magnitude transform coefficients. For the FRAT
and FRIT, we extend the image size to the next prime
number, 257, by replicating the last pixel in each row and
column. We use the orthogonal Symmlet wavelet with 4
vanishing moments [24] for both the DWT and the FRIT.
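
The evaluation procedure can be sketched as below. Several details are assumptions of this sketch rather than statements from the paper: the sampling domain $[-1, 1]^2$ for the test function, the SNR definition $10 \log_{10}(\|f\|^2 / \|f - \hat f\|^2)$, and the use of a 2-D orthonormal DCT as the example transform. The function names are hypothetical. Any orthonormal transform, such as the FRIT, could be substituted through the `inverse` argument.

```python
import numpy as np

def truncated_gaussian(size=256):
    """Sample 1{x2 < 2*x1 + 0.5} * exp(-x1^2 - x2^2) on an assumed [-1, 1]^2 grid."""
    x = np.linspace(-1.0, 1.0, size)
    x1, x2 = np.meshgrid(x, x, indexing='ij')
    return (x2 < 2 * x1 + 0.5) * np.exp(-x1 ** 2 - x2 ** 2)

def extend_to(img, p):
    """Grow the image to p x p by replicating the last pixel of each row and column."""
    pad = p - img.shape[0]
    return np.pad(img, ((0, pad), (0, pad)), mode='edge')

def dct_matrix(n):
    """Orthonormal DCT-II matrix, used here only as a stand-in transform."""
    k = np.arange(n)
    c = np.cos(np.pi * np.outer(k, 2 * k + 1) / (2 * n))
    c[0] *= np.sqrt(1 / n)
    c[1:] *= np.sqrt(2 / n)
    return c

def nla_snr(coeffs, inverse, n_keep):
    """SNR (dB) of the reconstruction from the n_keep >= 1 largest-magnitude coefficients."""
    flat = coeffs.ravel()
    keep = np.argsort(np.abs(flat))[-n_keep:]
    kept = np.zeros_like(flat)
    kept[keep] = flat[keep]
    original = inverse(coeffs)
    approx = inverse(kept.reshape(coeffs.shape))
    return 10 * np.log10(np.sum(original ** 2) / np.sum((original - approx) ** 2))

img = extend_to(truncated_gaussian(256), 257)
C = dct_matrix(257)
coeffs = C @ img @ C.T                      # forward 2-D DCT
inverse = lambda c: C.T @ c @ C             # inverse 2-D DCT
snr_32 = nla_snr(coeffs, inverse, 32)
snr_256 = nla_snr(coeffs, inverse, 256)
```

For an orthonormal transform the approximation error equals the energy of the discarded coefficients, so the SNR cannot decrease as more coefficients are kept.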

Our initial experiments indicate that in order to achieve
good results, it is necessary to apply strongly oscillatory
bases to certain FRAT projections to handle the "wrap
around" effect (refer to the remarks at the end of Section IV).
For images with linear singularities, we find that
in the FRAT domain, most of the image energy and
singularities are contained in the projections with the least
"wrap around" (see Figure 13(b)). Therefore, without
resorting to adaptive methods, we employ a simple,
pre-defined scheme where the DWT is only applied to the
projections with $\|(a_k^*, b_k^*)\| \le D$, while the remaining
projections use the DCT. We use $D = 3$ in our experiments, which
means in the tested FRIT, only 16 FRAT projections are
represented by the DWT. Although this FRIT consists
mostly of Fourier-type basis functions, due to the
concentration of energy mentioned above, the resulting nonlinear
approximation images are mainly composed of the
ridgelet-type functions that fit around the linear edge.
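
The count of 16 projections can be reproduced with a short enumeration. The sketch below rests on two assumptions of ours: that the optimal normal vectors with small norm are the primitive integer vectors $(a,b)$ taken up to sign, and that the norm in the selection rule is the maximum norm. The text above does not state which norm is used, so the comparison with the Euclidean norm is only an observation about the count.

```python
from math import gcd

def count_directions(D, norm):
    """Count primitive integer vectors (a, b), up to sign, with norm(a, b) <= D."""
    count = 0
    for b in range(0, D + 1):
        for a in range(-D, D + 1):
            if b == 0 and a <= 0:
                continue                  # keep one representative of +/-(a, b)
            if gcd(abs(a), b) != 1:
                continue                  # (2, 2) is the same direction as (1, 1)
            if norm(a, b) <= D:
                count += 1
    return count

max_norm = lambda a, b: max(abs(a), abs(b))
euclidean = lambda a, b: (a * a + b * b) ** 0.5

n_max = count_directions(3, max_norm)     # maximum-norm reading
n_euc = count_directions(3, euclidean)    # Euclidean reading
```

Under these assumptions the maximum norm gives 16 directions for $D = 3$, matching the number quoted above, whereas the Euclidean norm gives 8.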

Figure 10(b) displays the comparison results. We omit the
FRAT since its performance is much worse than the others.
Clearly the FRIT achieves the best result, as expected from
the continuous theory. Furthermore, the new ordering of
the FRAT coefficients is crucial for the FRIT in obtaining
good performance.

Fig. 10. (a) Test image: a truncated Gaussian image of size 256 x 256
that represents the function $f(x_1, x_2) = 1_{\{x_2 < 2x_1 + 0.5\}}\, e^{-x_1^2 - x_2^2}$.
(b) Comparison of non-linear approximations using four different
2-D transforms: DCT, DWT, FRIT with usual ordering and
FRIT with optimal ordering.




Fig. 11. Non-linear approximation comparison at different
orientations of the line singularity in the truncated Gaussian images
$f_\theta(x_1, x_2) = 1_{\{x_1 \cos\theta + x_2 \sin\theta < 0.3\}}\, e^{-x_1^2 - x_2^2}$. In each case, we
keep the 0.5% most significant coefficients. (Horizontal axis: angle
of the line singularity, $\theta$.)

(b) Using FRIT

Fig. 12. From left to right, reconstructed images from the 32, 64, 128 and 256 most significant coefficients of the DWT and FRIT, out of
65536 coefficients.



We then compare the performance when the singularity
line varies its orientation. Consider the truncated
Gaussian image again, using the function
$f_\theta(x_1, x_2) = 1_{\{x_1 \cos\theta + x_2 \sin\theta < 0.3\}}\, e^{-x_1^2 - x_2^2}$.
Due to the circular symmetry, we only need to consider
$0 \le \theta \le 90°$. Figure 11 shows
the results, where the FRIT (with optimal ordering)
consistently outperforms both the DWT, by more than 2 dB on
average, and the DCT.

Our next test is a real image of size 256 x 256 with 
straight edges. Figure 12 shows the images obtained from 
non-linear approximation using the DWT and FRIT. As 
can be seen, the FRIT correctly picks up the edges using 
the first few significant coefficients and produces visually 
better approximated images. But let us point out that even 
this simple test image can not be represented as a summa- 
tion of a few "global" linear singularities (like the Gaussian 
truncated images), and thus it is not in the optimal class 
of the ridgelet transform. 

To gain an insight into the FRIT, Figure 13(a) shows the
top five FRAT projections for the "object" image that
contain most of the energy, measured in the $l^2$-norm. Those
projections correspond to the directions that have
discontinuities across them, plus the horizontal and vertical directions.
Therefore, we see that at first the FRAT compacts most
of the energy of the image into a few projections (see
Figure 13(b)), where the linear discontinuities create "jumps".
Next, taking the 1-D DWT on those projections, which are
mainly smooth, compacts the energy further into a few
FRIT coefficients.



B. Image Denoising 

The motivation for the FRIT-based image denoising 
method is that in the FRIT domain, linear singularities 
of the image are represented by a few large coefficients, 
whereas randomly located noisy singularities are unlikely to 
produce significant coefficients. By contrast, in the DWT 
domain, both image edges and noisy pixels produce simi- 
lar amplitude coefficients. Therefore, a simple threshold- 
ing scheme for FRIT coefficients can be very effective in 
denoising images that are piecewise smooth away from sin- 
gularities along straight edges. 

We consider a simple case where the original image is
contaminated by an additive zero-mean Gaussian white
noise of variance $\sigma^2$. With an orthogonal FRIT, the noise
in the transform domain is also Gaussian white of the same
variance. Therefore it is appropriate to apply the
thresholding estimators that were proposed in [29] to the FRIT
coefficients. More specifically, our denoising algorithm
consists of the following steps:
Step 1: Applying FRIT to the noisy image.
Step 2: Hard-thresholding of FRIT coefficients with the
universal threshold $T = \sigma \sqrt{2 \log N}$ where $N = p^2$ pixels.
Step 3: Inverse FRIT of the thresholded coefficients.
For an image which is smooth away from linear singularities,
edges are visually well restored after Step 3. However,
due to the periodic property of the FRIT, strong edges
sometimes create "wrap around" effects which are visible
in the smooth regions of the image. In order to overcome
this problem, we optionally employ a 2-D adaptive filtering
step.

Fig. 13. (a) Top five FRAT projections of the "object" image that
contain most of the energy. (b) Distribution of total input image
energy among FRAT projections. Only the top 30 projections
are shown in the descending order. (Surviving panel titles from (a):
direction $(a,b) = (1,1)$, energy = 47.66%; direction $(a,b) = (-1,1)$,
energy = 1.94%. Horizontal axis of (b): direction.)

Step 4: (Optional) Adaptive Wiener filtering to reduce 
the "wrap around" effect. 

In some cases, this can enhance the visual appearance of
the restored image.
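
Steps 1 to 3 can be sketched generically for any orthonormal transform. In the code below the function names are hypothetical, the logarithm is taken to be the natural logarithm (the usual convention for the universal threshold), and an identity transform stands in for the FRIT purely to keep the example self-contained. The optional Step 4 is not shown.

```python
import numpy as np

def universal_threshold(sigma, num_pixels):
    """T = sigma * sqrt(2 * log N), with the natural logarithm (assumed)."""
    return sigma * np.sqrt(2 * np.log(num_pixels))

def hard_threshold_denoise(noisy, forward, inverse, sigma):
    """Steps 1-3 for a generic orthonormal transform pair (forward, inverse)."""
    coeffs = forward(noisy)                                # Step 1: forward transform
    T = universal_threshold(sigma, noisy.size)             # Step 2: universal threshold
    coeffs = np.where(np.abs(coeffs) > T, coeffs, 0.0)     #         hard thresholding
    return inverse(coeffs)                                 # Step 3: inverse transform

# Toy usage: the identity map replaces the FRIT, so the "coefficients" are the pixels.
identity = lambda x: x
noisy = np.array([[0.1, 10.0], [-0.2, -9.0]])
denoised = hard_threshold_denoise(noisy, identity, identity, sigma=1.0)
```

With $N = 4$ and $\sigma = 1$ the threshold is $\sqrt{2 \log 4} \approx 1.67$, so the two small entries are set to zero and the two large ones are kept unchanged.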

The above FRIT denoising algorithm is compared with 
the analogous wavelet hard-thresholding method using the 
same threshold value. Figure 14 shows the denoising results 
on the real image. The FRIT is clearly shown to be more 
effective than the DWT in recovering straight edges, as well 
as in terms of SNR's.

VII. Conclusion and Discussion 

We presented a new family of discrete orthonormal
transforms for images based on the ridgelet idea. Owing
to orthonormality, the proposed ridgelet transform is
self-inverting (the inverse transform uses the same
algorithm as the forward transform) and has excellent
numerical stability. Experimental results indicate that the
FRIT offers an efficient representation for images that are
smooth away from line discontinuities or straight edges.
Matlab code implementing the transforms and experiments
in this paper is available at an author's Web page
www.ifp.uiuc.edu/~minhdo.

Fig. 14. Comparison of denoising on the "object" image.
(a) Using DWT; SNR = 19.78 dB. (b) Using FRIT; SNR = 19.67 dB.
(c) Using FRIT and Wiener filter; SNR = 21.07 dB.

However, it is important to emphasize that the ridgelet
transform is only suited for discontinuities along straight
lines. For complex images, where edges are mainly along
curves and there are texture regions (which generate
point discontinuities), the ridgelet transform is not
optimal. Therefore, a more practical scheme in employing the
ridgelet transform is to first utilize a quad-tree division of
images into suitable blocks where edges look straight and
then apply the discrete ridgelet transform to each block.
Another scheme is to use the ridgelet transform as the
building block in a more localized construction such as the
curvelet transform [30].

Appendix

I. Orthogonal Wavelet Transform for Non-Dyadic Length Signals

In the construction of the orthonormal FRIT, we need
wavelet bases for signals of prime length $p$. In addition,
those bases have to satisfy the Condition Z in Lemma 1.
Let $n = 2^J$ be the nearest dyadic number to $p$ that is
smaller than or equal to $p$. Suppose that $p - n$ is small,
then one simple way of taking the wavelet transform on a
sequence of $p$ samples is to apply the usual wavelet
transform on the first $n$ samples and then extend it to cover the
remaining $p - n$ samples.

Let $\{v_m : m \in Z_n\}$ be the basis vectors of an
orthonormal wavelet transform of length $n$ with $J$
decomposition levels. We assume periodic extension is used to
handle the boundary. Suppose that $v_0$ corresponds to the
single scaling coefficient or the mean value, then all other
vectors must have zero mean (see Lemma 1). Denote by $c^{\{k\}}$
the vector with $k$ entries, all equal to $c$. Consider the
following $p$ vectors defined in $R^p$

$$\begin{aligned}
w_0 &= (1^{\{p\}}) / s_0 \\
w_1 &= (1^{\{p-1\}},\, -p+1) / s_1 \\
w_2 &= (1^{\{p-2\}},\, -p+2,\, 0) / s_2 \\
&\ \ \vdots \\
w_{p-n+1} &= (v_1,\, 0^{\{p-n\}}) \\
&\ \ \vdots \\
w_{p-1} &= (v_{n-1},\, 0^{\{p-n\}}).
\end{aligned}$$

Here $s_k$ is the scale factor such that $\|w_k\| = 1$. The
orthogonality of the new set $\{w_k : k \in Z_p\}$ can be easily
verified given the fact that $\{v_m : 1 \le m < n\}$ are orthonormal
vectors with zero mean. Therefore, $\{w_k : k \in Z_p\}$ is an
orthonormal basis for $R^p$ that satisfies the Condition Z. For a
length $p$ input vector $x = (x_0, x_1, \ldots, x_{p-1})$, the transform
coefficients corresponding to $w_k$, where $p - n < k \le p - 1$, can
be computed efficiently using the usual DWT with $J$ levels
on the first $n$ samples $x' = (x_0, x_1, \ldots, x_{n-1})$. The last
scaling coefficient is then replaced by $p - n + 1$ coefficients
corresponding to the basis vectors $w_k$, $k = 0, \ldots, p - n$.
Thus the new basis in $R^p$ also has fast transforms.
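
The construction can be checked numerically. The sketch below uses the Haar wavelet for $\{v_m\}$ and assumes the general pattern $w_k = (1^{\{p-k\}}, -(p-k), 0^{\{k-1\}})/s_k$ for $1 \le k \le p - n$, which is our extrapolation from the vectors $w_1$ and $w_2$ listed above. The function names are hypothetical.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar basis of R^n (n a power of 2); row 0 is constant."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        m = h.shape[0]
        top = np.kron(h, [1.0, 1.0])              # coarser-scale vectors, upsampled
        bottom = np.kron(np.eye(m), [1.0, -1.0])  # finest-scale wavelets
        h = np.vstack([top, bottom]) / np.sqrt(2)
    return h

def prime_length_basis(p):
    """Rows w_0..w_{p-1}: orthonormal basis of R^p whose row 0 is constant."""
    n = 1 << (p.bit_length() - 1)                 # largest power of 2 not exceeding p
    v = haar_matrix(n)
    w = np.zeros((p, p))
    w[0] = 1.0                                    # w_0: constant vector
    for k in range(1, p - n + 1):                 # assumed pattern for w_1..w_{p-n}
        w[k, :p - k] = 1.0
        w[k, p - k] = -(p - k)
    w[p - n + 1:, :n] = v[1:]                     # zero-padded wavelet vectors v_1..v_{n-1}
    return w / np.linalg.norm(w, axis=1, keepdims=True)

W = prime_length_basis(7)                         # p = 7, n = 4
gram_error = np.abs(W @ W.T - np.eye(7)).max()
row_sums = W[1:].sum(axis=1)                      # Condition Z: these should vanish
```

The matrix `W` is orthonormal up to round-off, its first row is constant, and every other row sums to zero, so the basis satisfies the Condition Z.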
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